Confidence scoring and rejection using multi-pass speech recognition

Author

  • Vincent Vanhoucke
Abstract

This paper presents a computationally efficient method for using multiple speech recognizers in a multi-pass framework to improve the rejection performance of an automatic speech recognition system. A set of criteria is proposed that determines, at run time, when rescoring with a second pass is expected to improve rejection performance. The second-pass result is used along with a set of features derived from the first pass to compute a combined confidence score. The feature combination is optimized globally based on training data. The combined system significantly outperforms a simple two-pass system at little more computational cost than comparable one-pass and two-pass systems.
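The flow described in the abstract (a run-time criterion that decides whether to run the second pass, followed by a combined confidence score over first-pass features and the second-pass result) can be sketched roughly as below. This is only an illustrative sketch, not the paper's method: the abstract does not name its criteria, features, or combination model, so the ambiguity band in `needs_second_pass`, the logistic combination, and all weights and thresholds are assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Map a real-valued score to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def needs_second_pass(first_pass_conf: float,
                      low: float = 0.3, high: float = 0.8) -> bool:
    """Hypothetical gating criterion: rescore only when the first-pass
    confidence falls in an ambiguous band. The band limits are assumed."""
    return low <= first_pass_conf <= high


def combined_confidence(features: Sequence[float],
                        weights: Sequence[float], bias: float) -> float:
    """Assumed combination model: logistic function of a weighted sum.
    In practice the weights would be fit on training data."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


def score_utterance(first_pass_features: Sequence[float],
                    first_pass_conf: float,
                    second_pass: Callable[[], float],
                    weights: Sequence[float], bias: float,
                    threshold: float = 0.5) -> Tuple[float, bool, bool]:
    """Return (confidence, accepted, second_pass_was_run).

    The second recognizer is only invoked when the gate fires, which is
    where the computational saving comes from in this sketch.
    """
    ran = needs_second_pass(first_pass_conf)
    second_score = second_pass() if ran else 0.0
    # Append the second-pass score and a flag saying whether it is valid.
    features: List[float] = list(first_pass_features) + [
        second_score, 1.0 if ran else 0.0]
    conf = combined_confidence(features, weights, bias)
    return conf, conf >= threshold, ran


if __name__ == "__main__":
    # Placeholder second-pass recognizer returning a made-up score.
    result = score_utterance([1.0], 0.5, lambda: 2.0, [1.0, 1.0, 0.5], 0.0)
    print(result)
```

Passing the second pass as a callable keeps the expensive recognizer from running at all when the gate decides the first-pass result is already clear enough to accept or reject.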


Similar resources

Prosodic Scoring of Recognition Outputs in the JUPITER Domain

JUPITER is a conversational system that allows users to access weather information over the telephone using natural speech [1]. This work examines the use of prosodic information to predict speech recognition errors more accurately for improved system robustness. Two approaches were explored here. The first approach is based on a probabilistic confidence scoring framework, which uses prosodic c...


Improving performance of an HMM-based ASR system by using monophone-level normalized confidence measure

In this paper, we propose a novel confidence scoring method that is applied to N-best hypotheses output from an HMM-based classifier. In the first pass of the proposed method, the HMM-based classifier with monophone models outputs N-best hypotheses (word candidates) and boundaries of all the monophones in the hypotheses. In the second pass, an SM (Sub-space Method)-based verifier tests the hypo...


Tightly integrated spoken language understanding using word-to-concept translation

This paper discusses an integrated spoken language understanding method using a statistical translation model from words to semantic concepts. The translation model is an N-gram-based model that can easily be integrated with speech recognition. It can be trained using annotated corpora where only sentence-level alignments between word sequences and concept sets are available, by automatic alignm...


Robust confidence annotation and rejection for continuous speech recognition

We are looking for confidence scoring techniques that perform well on a broad variety of tasks. Our main focus is on word-level error rejection, but most results apply to other scenarios as well. A variation of the Normalized Cross Entropy that is adapted to that purpose is introduced. It is successfully used to automatically select features and optimize the word-level confidence measure on sev...


Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech

In this paper, a framework for long audio alignment for conversational Arabic speech is proposed. Accurate alignments help in many speech processing tasks such as audio indexing, speech recognizer acoustic model (AM) training, and audio summarization and retrieval. We have collected more than 1,400 hours of conversational Arabic besides the corresponding human generated non-aligned transcriptio...




Publication date: 2005